Automatic Transcription of Recorded Music

Internal identifier: 000291 (Main/Exploration); previous: 000290; next: 000292

Authors: Peter Grosche [Germany]; Björn Schuller [Germany]; Meinard Müller [Germany]; Gerhard Rigoll [Germany]

Source:

Acta acustica united with acustica (Print) [1610-1928]; 2012.

RBID : Pascal:12-0406083

French descriptors

Pascal (Inist)
- Acoustique musicale, Enregistrement son, Tonie, Limite spectrale, Sensibilité contexte, Reproduction son, Emission sonore enregistrée, Synchronisation, Transcription automatique, Source sonore, Musique, Annotation, Durée, Transformation Fourier, Modèle Markov caché, Modèle Markov, Résolution temporelle, Analyse multirésolution, Acoustique audio.
Wicri:
- topic: Musique.

English descriptors

KwdEn :
- Annotation, Audio acoustics, Automatic transcription, Context aware, Duration, Fourier transformation, Hidden Markov model, Markov model, Multiresolution analysis, Music, Musical acoustics, Pitch(acoustics), Sound record, Sound recording, Sound reproduction, Sound source, Spectral limit, Synchronization, Time resolution.

Abstract

The automatic transcription of music recordings with the objective to derive a score-like representation from a given audio representation is a fundamental and challenging task. In particular for polyphonic music recordings with overlapping sound sources, current transcription systems still have problems to accurately extract the parameters of individual notes specified by pitch, onset, and duration. In this article, we present a music transcription system that is carefully designed to cope with various facets of music. One main idea of our approach is to consistently employ a mid-level representation that is based on a musically meaningful pitch scale. To achieve the necessary spectral and temporal resolution, we use a multi-resolution Fourier transform enhanced by an instantaneous frequency estimation. Subsequently, having extracted pitch and note onset information from this representation, we employ Hidden Markov Models (HMM) for determining the note events in a context-sensitive fashion. As another contribution, we evaluate our transcription system on an extensive dataset containing audio recordings of various genre. Here, opposed to many previous approaches, we do not only rely on synthetic audio material, but evaluate our system on real audio recordings using MIDI-audio synchronization techniques to automatically generate reference annotations.
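The abstract does not say which pitch scale underlies the mid-level representation. As a rough illustration only, the sketch below assumes the equal-tempered MIDI pitch scale with A4 = 440 Hz and pools spectral energy into one band per MIDI pitch. The function names `hz_to_midi` and `pool_to_pitch_bands`, the pitch range 21 to 108, and the nearest-pitch pooling rule are hypothetical choices made for this example and are not taken from the article.

```python
import math

A4_HZ = 440.0   # assumed reference tuning, not stated in the abstract
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(freq_hz):
    """Map a frequency in Hz to a (fractional) MIDI pitch number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def pool_to_pitch_bands(freqs_hz, magnitudes, low=21, high=108):
    """Sum squared magnitudes of spectral bins into per-pitch bands.

    Each bin is assigned to the nearest MIDI pitch; bins outside
    [low, high] or at non-positive frequencies are ignored.
    """
    bands = {pitch: 0.0 for pitch in range(low, high + 1)}
    for freq, mag in zip(freqs_hz, magnitudes):
        if freq <= 0:
            continue
        pitch = round(hz_to_midi(freq))
        if low <= pitch <= high:
            bands[pitch] += mag * mag
    return bands


if __name__ == "__main__":
    bands = pool_to_pitch_bands([440.0, 445.0, 880.0], [1.0, 2.0, 3.0])
    print(bands[69], bands[81])
```

In this toy input, the bins at 440 Hz and 445 Hz both fall into the band for MIDI pitch 69, while the 880 Hz bin lands one octave higher at pitch 81.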

Links toward previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000005
to stream PascalFrancis, to step Curation: 000009
to stream PascalFrancis, to step Checkpoint: 000005
to stream Main, to step Merge: 000291
to stream Main, to step Curation: 000291

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Automatic Transcription of Recorded Music</title>
<author><name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>Saarland University and MPI Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Schuller, Bjorn" sort="Schuller, Bjorn" uniqKey="Schuller B" first="Björn" last="Schuller">Björn Schuller</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Institute for Human-Machine Communication, Technische Universität München</s1>
<s2>80333 München</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Haute-Bavière</region>
<settlement type="city">Munich</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>Saarland University and MPI Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Rigoll, Gerhard" sort="Rigoll, Gerhard" uniqKey="Rigoll G" first="Gerhard" last="Rigoll">Gerhard Rigoll</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Institute for Human-Machine Communication, Technische Universität München</s1>
<s2>80333 München</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Haute-Bavière</region>
<settlement type="city">Munich</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">12-0406083</idno>
<date when="2012">2012</date>
<idno type="stanalyst">PASCAL 12-0406083 INIST</idno>
<idno type="RBID">Pascal:12-0406083</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000005</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000009</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000005</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000005</idno>
<idno type="wicri:doubleKey">1610-1928:2012:Grosche P:automatic:transcription:of</idno>
<idno type="wicri:Area/Main/Merge">000291</idno>
<idno type="wicri:Area/Main/Curation">000291</idno>
<idno type="wicri:Area/Main/Exploration">000291</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Automatic Transcription of Recorded Music</title>
<author><name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>Saarland University and MPI Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Schuller, Bjorn" sort="Schuller, Bjorn" uniqKey="Schuller B" first="Björn" last="Schuller">Björn Schuller</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Institute for Human-Machine Communication, Technische Universität München</s1>
<s2>80333 München</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Haute-Bavière</region>
<settlement type="city">Munich</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>Saarland University and MPI Informatik</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Rigoll, Gerhard" sort="Rigoll, Gerhard" uniqKey="Rigoll G" first="Gerhard" last="Rigoll">Gerhard Rigoll</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Institute for Human-Machine Communication, Technische Universität München</s1>
<s2>80333 München</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="1">Bavière</region>
<region type="district" nuts="2">District de Haute-Bavière</region>
<settlement type="city">Munich</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Acta acustica united with acustica : (Print)</title>
<title level="j" type="abbreviated">Acta acust. united Acust. : (Print)</title>
<idno type="ISSN">1610-1928</idno>
<imprint><date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Acta acustica united with acustica : (Print)</title>
<title level="j" type="abbreviated">Acta acust. united Acust. : (Print)</title>
<idno type="ISSN">1610-1928</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Annotation</term>
<term>Audio acoustics</term>
<term>Automatic transcription</term>
<term>Context aware</term>
<term>Duration</term>
<term>Fourier transformation</term>
<term>Hidden Markov model</term>
<term>Markov model</term>
<term>Multiresolution analysis</term>
<term>Music</term>
<term>Musical acoustics</term>
<term>Pitch(acoustics)</term>
<term>Sound record</term>
<term>Sound recording</term>
<term>Sound reproduction</term>
<term>Sound source</term>
<term>Spectral limit</term>
<term>Synchronization</term>
<term>Time resolution</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Acoustique musicale</term>
<term>Enregistrement son</term>
<term>Tonie</term>
<term>Limite spectrale</term>
<term>Sensibilité contexte</term>
<term>Reproduction son</term>
<term>Emission sonore enregistrée</term>
<term>Synchronisation</term>
<term>Transcription automatique</term>
<term>Source sonore</term>
<term>Musique</term>
<term>Annotation</term>
<term>Durée</term>
<term>Transformation Fourier</term>
<term>Modèle Markov caché</term>
<term>Modèle Markov</term>
<term>Résolution temporelle</term>
<term>Analyse multirésolution</term>
<term>Acoustique audio</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Musique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The automatic transcription of music recordings with the objective to derive a score-like representation from a given audio representation is a fundamental and challenging task. In particular for polyphonic music recordings with overlapping sound sources, current transcription systems still have problems to accurately extract the parameters of individual notes specified by pitch, onset, and duration. In this article, we present a music transcription system that is carefully designed to cope with various facets of music. One main idea of our approach is to consistently employ a mid-level representation that is based on a musically meaningful pitch scale. To achieve the necessary spectral and temporal resolution, we use a multi-resolution Fourier transform enhanced by an instantaneous frequency estimation. Subsequently, having extracted pitch and note onset information from this representation, we employ Hidden Markov Models (HMM) for determining the note events in a context-sensitive fashion. As another contribution, we evaluate our transcription system on an extensive dataset containing audio recordings of various genre. Here, opposed to many previous approaches, we do not only rely on synthetic audio material, but evaluate our system on real audio recordings using MIDI-audio synchronization techniques to automatically generate reference annotations.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
<region><li>Bavière</li>
<li>District de Haute-Bavière</li>
<li>Sarre (Land)</li>
</region>
<settlement><li>Munich</li>
<li>Sarrebruck</li>
</settlement>
</list>
<tree><country name="Allemagne"><region name="Sarre (Land)"><name sortKey="Grosche, Peter" sort="Grosche, Peter" uniqKey="Grosche P" first="Peter" last="Grosche">Peter Grosche</name>
</region>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<name sortKey="Rigoll, Gerhard" sort="Rigoll, Gerhard" uniqKey="Rigoll G" first="Gerhard" last="Rigoll">Gerhard Rigoll</name>
<name sortKey="Schuller, Bjorn" sort="Schuller, Bjorn" uniqKey="Schuller B" first="Björn" last="Schuller">Björn Schuller</name>
</country>
</tree>
</affiliations>
</record>
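
The record above uses the prefixes `wicri:` and `inist:` without declaring them, so a strict XML parser will likely reject it as given. The sketch below is one possible workaround, not part of Dilib: it wraps the record in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs (the `urn:example:...` values are assumptions), then reads the author names from `titleStmt`. The function name `extract_authors` and the shortened `SAMPLE` record are introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the record declares none, so these are
# assumptions chosen only to make the prefixes parseable.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist">{}</wrapper>'
)


def extract_authors(record_xml):
    """Return (first, last) pairs for the authors listed in titleStmt."""
    root = ET.fromstring(WRAPPER.format(record_xml))
    names = root.findall(
        "./record/TEI/teiHeader/fileDesc/titleStmt/author/name"
    )
    return [(name.get("first"), name.get("last")) for name in names]


# Shortened excerpt of the record, kept small for illustration.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Automatic Transcription of Recorded Music</title>
<author><name sortKey="Grosche, Peter" first="Peter" last="Grosche">Peter Grosche</name>
<affiliation wicri:level="3"><country>Allemagne</country></affiliation>
</author>
<author><name sortKey="Muller, Meinard" first="Meinard" last="Müller">Meinard Müller</name>
</author>
</titleStmt></fileDesc></teiHeader></TEI></record>"""

if __name__ == "__main__":
    for first, last in extract_authors(SAMPLE):
        print(first, last)
```

Restricting the path to `titleStmt` avoids counting each author twice, since the same names are repeated under `sourceDesc` and in the `affiliations` tree.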

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000291 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000291 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0406083
   |texte=   Automatic Transcription of Recorded Music
}}

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024

	Exploration server on music in Saarland
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
